Cuban Meal Crisis

Identifying ideal locations and hours for operating a Cuban food truck in the city of Los Angeles

Importing necessary libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

#While scraping the wikipedia page an error occured where the absence of lxml was indicated. This statement imports it. Upon installation, Kernel restart was necessary. /

#conda install -c anaconda lxml

print('Libraries imported.')
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/jupyterlab/conda/envs/python

  added / updated specs:
    - geopy


The following packages will be UPDATED:

  ca-certificates      anaconda::ca-certificates-2020.1.1-0 --> conda-forge::ca-certificates-2020.6.20-hecda079_0

The following packages will be SUPERSEDED by a higher-priority channel:

  certifi                anaconda::certifi-2020.6.20-py36_0 --> conda-forge::certifi-2020.6.20-py36h9f0ad1d_0
  openssl               anaconda::openssl-1.1.1g-h7b6447c_0 --> conda-forge::openssl-1.1.1g-h516909a_0


Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Libraries imported.
In [2]:
conda install -c anaconda lxml
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/jupyterlab/conda/envs/python

  added / updated specs:
    - lxml


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2020.1.1   |                0         132 KB  anaconda
    certifi-2020.6.20          |           py36_0         160 KB  anaconda
    libxslt-1.1.33             |       h7d1a2b0_0         577 KB  anaconda
    lxml-4.5.1                 |   py36hefd8a0e_0         1.4 MB  anaconda
    openssl-1.1.1g             |       h7b6447c_0         3.8 MB  anaconda
    ------------------------------------------------------------
                                           Total:         6.0 MB

The following NEW packages will be INSTALLED:

  libxslt            anaconda/linux-64::libxslt-1.1.33-h7d1a2b0_0
  lxml               anaconda/linux-64::lxml-4.5.1-py36hefd8a0e_0

The following packages will be SUPERSEDED by a higher-priority channel:

  ca-certificates    conda-forge::ca-certificates-2020.6.2~ --> anaconda::ca-certificates-2020.1.1-0
  certifi            conda-forge::certifi-2020.6.20-py36h9~ --> anaconda::certifi-2020.6.20-py36_0
  openssl            conda-forge::openssl-1.1.1g-h516909a_0 --> anaconda::openssl-1.1.1g-h7b6447c_0



Downloading and Extracting Packages
lxml-4.5.1           | 1.4 MB    | ##################################### | 100% 
openssl-1.1.1g       | 3.8 MB    | ##################################### | 100% 
certifi-2020.6.20    | 160 KB    | ##################################### | 100% 
libxslt-1.1.33       | 577 KB    | ##################################### | 100% 
ca-certificates-2020 | 132 KB    | ##################################### | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Note: you may need to restart the kernel to use updated packages.

Data import

The site in the URL is scraped to obtain the neighborhood data, from the zip codes

In [75]:
import requests

myurl = 'http://www.laalmanac.com/communications/cm02a90001-90899.php'
header = {
  "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
  "X-Requested-With": "XMLHttpRequest"
}
r = requests.get(myurl, headers=header)
df1 = pd.read_html(r.text)
LA_zip_community_lookup = df1[0]
In [92]:
LA_zip_community_lookup.rename(columns={'Zip Code':'ZIP_CODE'}, inplace=True)
LA_zip_community_lookup.rename(columns={'Cities/Communities':'NEIGHBORHOOD'}, inplace=True)
LA_zip_community_lookup.head()
Out[92]:
ZIP_CODE NEIGHBORHOOD
0 90001 Los Angeles (South Los Angeles), Florence-Graham
1 90002 Los Angeles (Southeast Los Angeles, Watts)
2 90003 Los Angeles (South Los Angeles, Southeast Los ...
3 90004 Los Angeles (Hancock Park, Rampart Village, Vi...
4 90005 Los Angeles (Hancock Park, Koreatown, Wilshire...

The csv file obtained from the link mentioned in the report contains a list of all active businesses in the LA area. This data is being read here for analysis.

In [76]:
dfs = pd.read_csv('listing-of-active-businesses.csv')
dfs.head()
Out[76]:
LOCATION ACCOUNT # BUSINESS NAME DBA NAME STREET ADDRESS CITY ZIP CODE LOCATION DESCRIPTION MAILING ADDRESS MAILING CITY MAILING ZIP CODE NAICS PRIMARY NAICS DESCRIPTION COUNCIL DISTRICT LOCATION START DATE LOCATION END DATE LOCATION Zip Codes Council Districts Census Tracts Precinct Boundaries LA Specific Plans Neighborhood Councils (Certified)
0 0000000108-0001-3 PALACE OF VENICE GUEST HOME /C NaN 1727 CRENSHAW BLVD LOS ANGELES 90019-6037 1727 CRENSHAW 90019-6037 NaN NaN NaN 721310.0 Rooming & boarding houses 10.0 1991-05-15T00:00:00.000 NaN {'latitude': '34.0425', 'longitude': '-118.3295'} 23080.0 12.0 648.0 1105.0 NaN 19.0
1 0000000115-0001-3 VINCENZO LABELLA NaN 521 SWARTHMORE AVENUE PACIFIC PALISADES 90272-4350 521 SWARTHMORE 90272-4350 521 SWARTHMORE AVENUE PACIFIC PALISADES 90272-4350 561500.0 Travel arrangement & reservation services 11.0 1990-01-01T00:00:00.000 NaN NaN NaN NaN NaN NaN NaN NaN
2 0000000121-0001-9 WILCARE ECONOMIC DEVELOPMENT CORPORATION NaN 9911 AVALON BLVD LOS ANGELES 90003-4805 9911 AVALON 90003-4805 448 E 99TH STREET LOS ANGELES 90003-4804 721310.0 Rooming & boarding houses 8.0 1999-01-01T00:00:00.000 NaN {'latitude': '33.9463', 'longitude': '-118.2651'} 22351.0 14.0 806.0 1176.0 7.0 45.0
3 0000000132-0001-7 CARLOS ANGEL NaN 1221 W 7TH STREET SUITE #N-407 LOS ANGELES 90017-2689 1221 7TH 90017-2689 NaN NaN NaN 561300.0 Employment services 1.0 1999-07-01T00:00:00.000 NaN {'latitude': '34.0518', 'longitude': '-118.2665'} 23078.0 11.0 564.0 1378.0 54.0 76.0
4 0000000133-0001-1 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C NaN 4917 S BROADWAY LOS ANGELES 90037-3211 4917 BROADWAY 90037-3211 2607 VAN BUREN PLACE LOS ANGELES 90007-2129 611000.0 Educational services (including schools, colle... 9.0 1991-01-01T00:00:00.000 NaN {'latitude': '33.9981', 'longitude': '-118.2783'} 23668.0 13.0 737.0 655.0 7.0 NaN

Cleaning up

The imported data set is being cleaned up by discarding columns that are not useful for this analysis, removing rows with partial/no information.

In [77]:
losangeles_businesses = dfs.drop(['LOCATION ACCOUNT #', 'DBA NAME', 'LOCATION DESCRIPTION', 'MAILING ADDRESS', 'MAILING ZIP CODE', 'MAILING CITY', 'NAICS', 'COUNCIL DISTRICT', 'LOCATION START DATE', 'LOCATION END DATE', 'Zip Codes', 'Council Districts', 'Census Tracts', 'Precinct Boundaries', 'LA Specific Plans', 'Neighborhood Councils (Certified)'], axis=1)
losangeles_businesses.head()
Out[77]:
BUSINESS NAME STREET ADDRESS CITY ZIP CODE PRIMARY NAICS DESCRIPTION LOCATION
0 PALACE OF VENICE GUEST HOME /C 1727 CRENSHAW BLVD LOS ANGELES 90019-6037 Rooming & boarding houses {'latitude': '34.0425', 'longitude': '-118.3295'}
1 VINCENZO LABELLA 521 SWARTHMORE AVENUE PACIFIC PALISADES 90272-4350 Travel arrangement & reservation services NaN
2 WILCARE ECONOMIC DEVELOPMENT CORPORATION 9911 AVALON BLVD LOS ANGELES 90003-4805 Rooming & boarding houses {'latitude': '33.9463', 'longitude': '-118.2651'}
3 CARLOS ANGEL 1221 W 7TH STREET SUITE #N-407 LOS ANGELES 90017-2689 Employment services {'latitude': '34.0518', 'longitude': '-118.2665'}
4 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C 4917 S BROADWAY LOS ANGELES 90037-3211 Educational services (including schools, colle... {'latitude': '33.9981', 'longitude': '-118.2783'}
In [78]:
len(losangeles_businesses)
Out[78]:
10715
In [79]:
losangeles_businesses.dropna(inplace=True)
losangeles_businesses.reset_index(drop=True, inplace=True)
len(losangeles_businesses)
Out[79]:
9853
In [80]:
losangeles_businesses.head()
Out[80]:
BUSINESS NAME STREET ADDRESS CITY ZIP CODE PRIMARY NAICS DESCRIPTION LOCATION
0 PALACE OF VENICE GUEST HOME /C 1727 CRENSHAW BLVD LOS ANGELES 90019-6037 Rooming & boarding houses {'latitude': '34.0425', 'longitude': '-118.3295'}
1 WILCARE ECONOMIC DEVELOPMENT CORPORATION 9911 AVALON BLVD LOS ANGELES 90003-4805 Rooming & boarding houses {'latitude': '33.9463', 'longitude': '-118.2651'}
2 CARLOS ANGEL 1221 W 7TH STREET SUITE #N-407 LOS ANGELES 90017-2689 Employment services {'latitude': '34.0518', 'longitude': '-118.2665'}
3 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C 4917 S BROADWAY LOS ANGELES 90037-3211 Educational services (including schools, colle... {'latitude': '33.9981', 'longitude': '-118.2783'}
4 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C 1330 WILSHIRE BLVD #208 LOS ANGELES 90017-1705 Educational services (including schools, colle... {'latitude': '34.0543', 'longitude': '-118.2678'}
In [81]:
temp = losangeles_businesses["ZIP CODE"].str.split("-", n = 1, expand = True)
losangeles_businesses["ZIP CODE"] = temp[0]

losangeles_businesses.head()
Out[81]:
BUSINESS NAME STREET ADDRESS CITY ZIP CODE PRIMARY NAICS DESCRIPTION LOCATION
0 PALACE OF VENICE GUEST HOME /C 1727 CRENSHAW BLVD LOS ANGELES 90019 Rooming & boarding houses {'latitude': '34.0425', 'longitude': '-118.3295'}
1 WILCARE ECONOMIC DEVELOPMENT CORPORATION 9911 AVALON BLVD LOS ANGELES 90003 Rooming & boarding houses {'latitude': '33.9463', 'longitude': '-118.2651'}
2 CARLOS ANGEL 1221 W 7TH STREET SUITE #N-407 LOS ANGELES 90017 Employment services {'latitude': '34.0518', 'longitude': '-118.2665'}
3 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C 4917 S BROADWAY LOS ANGELES 90037 Educational services (including schools, colle... {'latitude': '33.9981', 'longitude': '-118.2783'}
4 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C 1330 WILSHIRE BLVD #208 LOS ANGELES 90017 Educational services (including schools, colle... {'latitude': '34.0543', 'longitude': '-118.2678'}

Some string manipulation operations are being performed here so that the malformed JSON in the LOCATION column can be interpreted correctly as LATITUDE and LONGITUDE

In [83]:
losangeles_businesses["LOCATION"] = losangeles_businesses["LOCATION"].str.replace("'", "")
losangeles_businesses["LOCATION"] = losangeles_businesses["LOCATION"].str.replace(",", "")
losangeles_businesses["LOCATION"] = losangeles_businesses["LOCATION"].str.replace("}", "")
temp_lat1 = losangeles_businesses["LOCATION"].str.split(" ", n = 1, expand = True)[1]
temp_lat2 = temp_lat1.str.split(" ", n = 1, expand = True)[0]

temp_long1 = losangeles_businesses["LOCATION"].str.split(" ", n = 1, expand = True)[1]
temp_long2 = temp_lat1.str.split(":", n = 1, expand = True)[1]
temp_long2 = temp_long2.str.replace(" ", "")
temp_long2 = temp_long2.str.split("human_address", n = 1, expand = True)[0]
In [84]:
losangeles_businesses["LATITUDE"] = temp_lat2.to_frame()
losangeles_businesses["LONGITUDE"] = temp_long2.to_frame()
losangeles_businesses.drop(columns=["LOCATION"],inplace=True)
losangeles_businesses.head()
len(losangeles_businesses)
Out[84]:
9853

The dataset seems to contain a large amount of data, which slows down processing. Hence, for the purposes of this analysis, only the locations around the city of LA is considered. The neighboring suburbs are not studied here and are thus discarded.

In [85]:
losangeles_businesses_only = losangeles_businesses[losangeles_businesses['CITY'].str.contains("LOS ANGELES", case=False, na=False)].reset_index(drop=True)
losangeles_businesses_only.rename(columns={'ZIP CODE':'ZIP_CODE'}, inplace=True)
len(losangeles_businesses_only)
Out[85]:
4494
In [86]:
losangeles_businesses_only.head()
Out[86]:
BUSINESS NAME STREET ADDRESS CITY ZIP_CODE PRIMARY NAICS DESCRIPTION LATITUDE LONGITUDE
0 PALACE OF VENICE GUEST HOME /C 1727 CRENSHAW BLVD LOS ANGELES 90019 Rooming & boarding houses 34.0425 -118.3295
1 WILCARE ECONOMIC DEVELOPMENT CORPORATION 9911 AVALON BLVD LOS ANGELES 90003 Rooming & boarding houses 33.9463 -118.2651
2 CARLOS ANGEL 1221 W 7TH STREET SUITE #N-407 LOS ANGELES 90017 Employment services 34.0518 -118.2665
3 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C 4917 S BROADWAY LOS ANGELES 90037 Educational services (including schools, colle... 33.9981 -118.2783
4 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C 1330 WILSHIRE BLVD #208 LOS ANGELES 90017 Educational services (including schools, colle... 34.0543 -118.2678
In [90]:
losangeles_businesses_only['LATITUDE'] = losangeles_businesses_only['LATITUDE'].astype(float)
losangeles_businesses_only['LONGITUDE'] = losangeles_businesses_only['LONGITUDE'].astype(float)

A master dataset is created here with neighborhood names included by merging the neighborhoods dataset along with the LA active businesses dataset

In [93]:
losangeles_businesses_only.replace('', np.nan, inplace=True)
losangeles_businesses_only.dropna(inplace=True)
losangeles_businesses_only['ZIP_CODE'] = losangeles_businesses_only['ZIP_CODE'].astype(int)
LA_zip_community_lookup['ZIP_CODE'] = LA_zip_community_lookup['ZIP_CODE'].astype(int)
la_businesses_master = pd.merge(losangeles_businesses_only, LA_zip_community_lookup[['NEIGHBORHOOD','ZIP_CODE']], on='ZIP_CODE')
la_businesses_master.head()
Out[93]:
BUSINESS NAME STREET ADDRESS CITY ZIP_CODE PRIMARY NAICS DESCRIPTION LATITUDE LONGITUDE NEIGHBORHOOD
0 PALACE OF VENICE GUEST HOME /C 1727 CRENSHAW BLVD LOS ANGELES 90019 Rooming & boarding houses 34.0425 -118.3295 Los Angeles (Arlington Heights, Country Club P...
1 PUBLIC STORAGE INC 5570 AIRDROME STREET LOS ANGELES 90019 Lessors of real estate (including mini warehou... 34.0444 -118.3621 Los Angeles (Arlington Heights, Country Club P...
2 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C 4473 W PICO BLVD LOS ANGELES 90019 Educational services (including schools, colle... 34.0483 -118.3329 Los Angeles (Arlington Heights, Country Club P...
3 MATTHEW K MARCY 1278 QUEEN ANNE PLACE LOS ANGELES 90019 Lessors of real estate (including mini warehou... 34.0484 -118.3326 Los Angeles (Arlington Heights, Country Club P...
4 SPECIAL SERVICE FOR GROUPS 1310 S ST ANDREWS PLACE LOS ANGELES 90019 Individual & family services 34.0470 -118.3116 Los Angeles (Arlington Heights, Country Club P...

Obtain dataset of competing Cuban restaurants across LA

The foursquare API is used here to obtain the competing Cuban restaurants

In [94]:
search_query = 'Cuban'
radius = 25000
print(search_query + ' .... OK!')

address = 'Los Angeles, CA'

geolocator = Nominatim(user_agent="ca_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Los Angeles are {}, {}.'.format(latitude, longitude))
Cuban .... OK!
The geograpical coordinate of Los Angeles are 34.0536909, -118.2427666.
In [95]:
CLIENT_ID = 'JBY0YANCCPVYDZJRGUC4PKUJXPXRSCB52IZYBIN3VV4BH3OQ' # your Foursquare ID
CLIENT_SECRET = 'MV4MQUCRVYAIOPQTHU2EGDWRV4SSLTQPHAZUR5LQOC5C1QWX' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 75
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)
Your credentails:
CLIENT_ID: JBY0YANCCPVYDZJRGUC4PKUJXPXRSCB52IZYBIN3VV4BH3OQ
CLIENT_SECRET:MV4MQUCRVYAIOPQTHU2EGDWRV4SSLTQPHAZUR5LQOC5C1QWX
In [96]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url
Out[96]:
'https://api.foursquare.com/v2/venues/search?client_id=JBY0YANCCPVYDZJRGUC4PKUJXPXRSCB52IZYBIN3VV4BH3OQ&client_secret=MV4MQUCRVYAIOPQTHU2EGDWRV4SSLTQPHAZUR5LQOC5C1QWX&ll=34.0536909,-118.2427666&v=20180604&query=Cuban&radius=25000&limit=75'
In [97]:
results = requests.get(url).json()
#results
In [98]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# tranform venues into a dataframe
dataframe = pd.json_normalize(venues)
dataframe.head()
Out[98]:
id name categories referralId hasPerk location.address location.crossStreet location.lat location.lng location.labeledLatLngs location.distance location.postalCode location.cc location.city location.state location.country location.formattedAddress delivery.id delivery.url delivery.provider.name delivery.provider.icon.prefix delivery.provider.icon.sizes delivery.provider.icon.name venuePage.id
0 49dbc8eff964a520055f1fe3 Versailles Cuban Food [{'id': '4bf58dd8d48988d154941735', 'name': 'C... v-1592966273 False 10319 Venice Blvd at Motor Ave. 34.021052 -118.403515 [{'label': 'display', 'lat': 34.02105161560661... 15267 90034 US Los Angeles CA United States [10319 Venice Blvd (at Motor Ave.), Los Angele... 1004827 https://www.grubhub.com/restaurant/versailles-... grubhub https://fastly.4sqi.net/img/general/cap/ [40, 50] /delivery_provider_grubhub_20180129.png NaN
1 4ce779910f196dcb57e03eae Cuban Seed Cigar Co. [{'id': '4bf58dd8d48988d123951735', 'name': 'S... v-1592966273 False 8851 W Sunset Blvd Across from The Viper Room 34.090756 -118.384714 [{'label': 'display', 'lat': 34.09075595938065... 13723 90069 US West Hollywood CA United States [8851 W Sunset Blvd (Across from The Viper Roo... NaN NaN NaN NaN NaN NaN 81983882
2 4b4544e5f964a520840926e3 Tropicana Bakery & Cuban Cafe [{'id': '4bf58dd8d48988d16a941735', 'name': 'B... v-1592966273 False 10218 Paramount Blvd at Florence Ave 33.952925 -118.130315 [{'label': 'display', 'lat': 33.95292490827333... 15281 90241 US Downey CA United States [10218 Paramount Blvd (at Florence Ave), Downe... NaN NaN NaN NaN NaN NaN NaN
3 5b39450a031320002cea9627 Vegan Cuban Food [{'id': '56aa371be4b08b9a8d57350b', 'name': 'F... v-1592966273 False NaN NaN 34.035336 -118.241896 [{'label': 'display', 'lat': 34.035336, 'lng':... 2044 90021 US Los Angeles CA United States [Los Angeles, CA 90021, United States] NaN NaN NaN NaN NaN NaN NaN
4 5de82f8e0fd8370008be3f2d Equelecua Cuban Vegan Cafe [{'id': '4bf58dd8d48988d154941735', 'name': 'C... v-1592966273 False 55 S Madison Ave NaN 34.144821 -118.138779 [{'label': 'display', 'lat': 34.14482116699219... 13956 91101 US Pasadena CA United States [55 S Madison Ave, Pasadena, CA 91101, United ... 1703791 https://www.grubhub.com/restaurant/equelecu-cu... grubhub https://fastly.4sqi.net/img/general/cap/ [40, 50] /delivery_provider_grubhub_20180129.png NaN

Cleaning up the JSON data obtained from foursquare

In [99]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns]

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered.head()
Out[99]:
name categories address crossStreet lat lng labeledLatLngs distance postalCode cc city state country formattedAddress id
0 Versailles Cuban Food Cuban Restaurant 10319 Venice Blvd at Motor Ave. 34.021052 -118.403515 [{'label': 'display', 'lat': 34.02105161560661... 15267 90034 US Los Angeles CA United States [10319 Venice Blvd (at Motor Ave.), Los Angele... 49dbc8eff964a520055f1fe3
1 Cuban Seed Cigar Co. Smoke Shop 8851 W Sunset Blvd Across from The Viper Room 34.090756 -118.384714 [{'label': 'display', 'lat': 34.09075595938065... 13723 90069 US West Hollywood CA United States [8851 W Sunset Blvd (Across from The Viper Roo... 4ce779910f196dcb57e03eae
2 Tropicana Bakery & Cuban Cafe Bakery 10218 Paramount Blvd at Florence Ave 33.952925 -118.130315 [{'label': 'display', 'lat': 33.95292490827333... 15281 90241 US Downey CA United States [10218 Paramount Blvd (at Florence Ave), Downe... 4b4544e5f964a520840926e3
3 Vegan Cuban Food Food Stand NaN NaN 34.035336 -118.241896 [{'label': 'display', 'lat': 34.035336, 'lng':... 2044 90021 US Los Angeles CA United States [Los Angeles, CA 90021, United States] 5b39450a031320002cea9627
4 Equelecua Cuban Vegan Cafe Cuban Restaurant 55 S Madison Ave NaN 34.144821 -118.138779 [{'label': 'display', 'lat': 34.14482116699219... 13956 91101 US Pasadena CA United States [55 S Madison Ave, Pasadena, CA 91101, United ... 5de82f8e0fd8370008be3f2d

Since the search keyword used on Foursquare was 'Cuban', the results contained Smoke Shops and Convention centers as well. Hence the data is being filtered for places that just serve food

In [100]:
dataframe_filtered = dataframe_filtered[dataframe_filtered['categories'].notna()]
df_cleaned = dataframe_filtered[dataframe_filtered['categories'].str.contains('restaurant|food|bar|caf', case=False)]
In [101]:
cuban_competitors = df_cleaned.reset_index(drop=True)
cuban_competitors = cuban_competitors[['name','lat','lng']]
cuban_competitors
Out[101]:
name lat lng
0 Versailles Cuban Food 34.021052 -118.403515
1 Vegan Cuban Food 34.035336 -118.241896
2 Equelecua Cuban Vegan Cafe 34.144821 -118.138779
3 Crispy Cuban Express 34.044597 -118.272269
4 Crispy Cuban Express 34.044296 -118.272611
5 No Jodas Cuban Kitchen 33.888439 -118.383581
6 Guantamera Fine Cuban Cuisine 34.183930 -118.322314
7 Mayumba Cuban 34.072974 -118.070880
8 Cuban Bistro-Glendale 34.146912 -118.253572
9 Baracoa Cuban Cafe 34.117513 -118.261600
10 Cuban/Puerto Rican - Cafe Atlantic 34.147018 -118.149200
11 Las Villas Restaurant Cuban Cuisine 33.967685 -118.204236
12 Gigi's Bakery & Cafe 34.070700 -118.269436
13 El Floridita Cuban Restaurant 34.094133 -118.327038
14 Brother's Grill Cuban & Argentinian 33.944416 -118.206200
15 Juana la Cubana 34.030892 -118.267003
16 Lili's Express Cuban Food 33.992286 -118.113700
17 The Lost Cuban Kitchen 34.144810 -118.138519
18 Versailles 34.052582 -118.376334
19 Florida Cuban Restaurant 33.948355 -118.117090
20 Zoe's Cuban Cuisine 34.156315 -118.136535
21 Equelecuá Cuban Café 33.960950 -118.375925
22 La Cubana 34.137050 -118.251503
23 Cuban Cafe & Bakery 33.830433 -118.290131
24 Las Palmas Spanish and Cuban Cuisine 34.187008 -118.386833
25 Lili's Cuban Buffet 34.162060 -118.068237
26 Varadero Cuban Cafe 33.902363 -118.361580
27 Mambo's Cafe 34.161368 -118.300017
28 La Cubana 34.147739 -118.254035
29 Habana Vieja Cuban Cuisine & Cafe 33.831094 -118.308272
30 Florida Cuban Restaurant 33.935333 -118.038669
31 Los Amigos Bar & Grill 34.158322 -118.332773
32 Mexi Cuban Catering 34.044300 -117.986200
33 El Criollo Bar & Grill 34.183716 -118.322553
34 El Tumbao -Cuban Cuisine 34.041901 -117.948441

All active LA businesses are being visualized in blue on a city map of LA

In [102]:
map_la_businesses = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, name in zip(la_businesses_master['LATITUDE'], la_businesses_master['LONGITUDE'], la_businesses_master['BUSINESS NAME']):
    label = '{}'.format(name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=2,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.8,
        parse_html=False).add_to(map_la_businesses)  
    
map_la_businesses
Out[102]:
Make this Notebook Trusted to load map: File -> Trust Notebook

A flag is added to check if each location is a competitor or not

In [25]:
la_businesses_master['close_to_competitor'] = False

In this part, businesses within a 3km radius from each Cuban Restaurant is flagged as 'Close to Competitor'

In [26]:
from geopy import distance

for clat, clng, clabel in zip(cuban_competitors['lat'], cuban_competitors['lng'], cuban_competitors['name']):
    for la_index, la_lat, la_lng, la_street_address, la_business_name in zip(la_businesses_master.index, la_businesses_master['LATITUDE'], la_businesses_master['LONGITUDE'], la_businesses_master['STREET ADDRESS'], la_businesses_master['BUSINESS NAME']):
        
        center_point = [{'lat': clat, 'lng': clng}]
        test_point = [{'lat': la_lat, 'lng': la_lng}]
        radius = 3 # in kilometer
        
        center_point_tuple = tuple(center_point[0].values()) # (-7.7940023, 110.3656535)
        test_point_tuple = tuple(test_point[0].values()) # (-7.79457, 110.36563)
        dis = distance.distance(center_point_tuple, test_point_tuple).km
        #print(la_index)
        
        if dis < radius:
            la_businesses_master.at[la_index, 'close_to_competitor'] = True
        
        
print('Completed!')
Completed!
In [27]:
la_businesses_master.head()
Out[27]:
BUSINESS NAME STREET ADDRESS CITY ZIP_CODE PRIMARY NAICS DESCRIPTION LATITUDE LONGITUDE NEIGHBORHOOD close_to_competitor
0 PALACE OF VENICE GUEST HOME /C 1727 CRENSHAW BLVD LOS ANGELES 90019 Rooming & boarding houses 34.0425 -118.3295 Los Angeles (Arlington Heights, Country Club P... False
1 PUBLIC STORAGE INC 5570 AIRDROME STREET LOS ANGELES 90019 Lessors of real estate (including mini warehou... 34.0444 -118.3621 Los Angeles (Arlington Heights, Country Club P... True
2 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C 4473 W PICO BLVD LOS ANGELES 90019 Educational services (including schools, colle... 34.0483 -118.3329 Los Angeles (Arlington Heights, Country Club P... False
3 MATTHEW K MARCY 1278 QUEEN ANNE PLACE LOS ANGELES 90019 Lessors of real estate (including mini warehou... 34.0484 -118.3326 Los Angeles (Arlington Heights, Country Club P... False
4 SPECIAL SERVICE FOR GROUPS 1310 S ST ANDREWS PLACE LOS ANGELES 90019 Individual & family services 34.0470 -118.3116 Los Angeles (Arlington Heights, Country Club P... False

Locations labeled as close to 'competitor' is being filtered out

In [28]:
la_businesses_master.groupby('close_to_competitor').count()
Out[28]:
BUSINESS NAME STREET ADDRESS CITY ZIP_CODE PRIMARY NAICS DESCRIPTION LATITUDE LONGITUDE NEIGHBORHOOD
close_to_competitor
False 2130 2130 2130 2130 2130 2130 2130 2130
True 2361 2361 2361 2361 2361 2361 2361 2361
In [29]:
la_businesses_master = la_businesses_master[la_businesses_master['close_to_competitor'] == False] 
la_businesses_master.shape
Out[29]:
(2130, 9)

Estimating number of businesses per neighborhood

In [30]:
la_businesses_master.groupby('NEIGHBORHOOD').count()
Out[30]:
BUSINESS NAME STREET ADDRESS CITY ZIP_CODE PRIMARY NAICS DESCRIPTION LATITUDE LONGITUDE close_to_competitor
NEIGHBORHOOD
Athens, Los Angeles (South Los Angeles) 57 57 57 57 57 57 57 57
City Terrace, Los Angeles (Boyle Heights) 14 14 14 14 14 14 14 14
Commerce, City of 1 1 1 1 1 1 1 1
Commerce, East Los Angeles 29 29 29 29 29 29 29 29
Culver City, Los Angeles (Mar Vista) 66 66 66 66 66 66 66 66
East Los Angeles 12 12 12 12 12 12 12 12
Ladera Heights 4 4 4 4 4 4 4 4
Los Angeles (Arlington Heights, Country Club Park, Mid-City) 53 53 53 53 53 53 53 53
Los Angeles (Baldwin Hills, Crenshaw, Leimert Park) 65 65 65 65 65 65 65 65
Los Angeles (Bel Air Estates, Beverly Glen) 15 15 15 15 15 15 15 15
Los Angeles (Bel Air Estates, Brentwood) 106 106 106 106 106 106 106 106
Los Angeles (Boyle Heights) 37 37 37 37 37 37 37 37
Los Angeles (Byzantine-Latino Quarter, Harvard Heights, Koreatown, Pico Heights) 23 23 23 23 23 23 23 23
Los Angeles (Century City) 79 79 79 79 79 79 79 79
Los Angeles (Cheviot Hills, Rancho Park) 114 114 114 114 114 114 114 114
Los Angeles (Cypress Park, Glassell Park, Mt Washington) 34 34 34 34 34 34 34 34
Los Angeles (Downtown Bunker Hill, City West, Historic Core, South Park-North) 1 1 1 1 1 1 1 1
Los Angeles (Downtown Central, Downtown Fashion District) 6 6 6 6 6 6 6 6
Los Angeles (Downtown Civic Center, Chinatown, Arts District, Bunker Hill, Historic Core, Little Tokyo) 23 23 23 23 23 23 23 23
Los Angeles (Downtown Historic Core, Arts District) 1 1 1 1 1 1 1 1
Los Angeles (Dowtown Fashion District, South Park-South) 1 1 1 1 1 1 1 1
Los Angeles (Eagle Rock) 47 47 47 47 47 47 47 47
Los Angeles (East Hollywood) 13 13 13 13 13 13 13 13
Los Angeles (Echo Park, Silver Lake) 3 3 3 3 3 3 3 3
Los Angeles (El Sereno, Monterey Hills, University Hills) 47 47 47 47 47 47 47 47
Los Angeles (Fairfax, Melrose, Miracle Mile, Park La Brea, Wilshire-La Brea) 52 52 52 52 52 52 52 52
Los Angeles (Griffith Park, Hollywood, Los Feliz) 29 29 29 29 29 29 29 29
Los Angeles (Hancock Park, Koreatown, Wilshire Center, Wilshire Park, Windsor Square) 25 25 25 25 25 25 25 25
Los Angeles (Hancock Park, Rampart Village, Virgil Village, Wilshire Center, Windsor Square) 7 7 7 7 7 7 7 7
Los Angeles (Hancock Park, Western Wilton, Wilshire Center, Windsor Square) 27 27 27 27 27 27 27 27
Los Angeles (Hancock Park, Wilshire Center, Windsor Square) 66 66 66 66 66 66 66 66
Los Angeles (Highland Park) 71 71 71 71 71 71 71 71
Los Angeles (Hollywood) 22 22 22 22 22 22 22 22
Los Angeles (Hollywood), West Hollywood 36 36 36 36 36 36 36 36
Los Angeles (Hollywood, Melrose), West Hollywood 30 30 30 30 30 30 30 30
Los Angeles (Hyde Park, View Park, Windsor Hills) 62 62 62 62 62 62 62 62
Los Angeles (Jefferson Park, Leimert Park) 70 70 70 70 70 70 70 70
Los Angeles (Lincoln Heights, Montecito Heights) 44 44 44 44 44 44 44 44
Los Angeles (Los Angeles International Airport, Westchester) 20 20 20 20 20 20 20 20
Los Angeles (Mid-City West), West Hollywood 7 7 7 7 7 7 7 7
Los Angeles (Palms) 3 3 3 3 3 3 3 3
Los Angeles (Sawtelle, West Los Angeles) 234 234 234 234 234 234 234 234
Los Angeles (South Los Angeles) 146 146 146 146 146 146 146 146
Los Angeles (South Los Angeles), Florence-Graham 17 17 17 17 17 17 17 17
Los Angeles (South Los Angeles, Southeast Los Angeles) 64 64 64 64 64 64 64 64
Los Angeles (Southeast Los Angeles) 30 30 30 30 30 30 30 30
Los Angeles (Southeast Los Angeles), Vernon 7 7 7 7 7 7 7 7
Los Angeles (Southeast Los Angeles, Univerity Park) 4 4 4 4 4 4 4 4
Los Angeles (Southeast Los Angeles, Watts) 17 17 17 17 17 17 17 17
Los Angeles (Southeast Los Angeles, Watts), Willowbrook 26 26 26 26 26 26 26 26
Los Angeles (University of California Los Angeles) 1 1 1 1 1 1 1 1
Los Angeles (West Adams) 35 35 35 35 35 35 35 35
Los Angeles (West Fairfax) 2 2 2 2 2 2 2 2
Los Angeles (Westwood) 124 124 124 124 124 124 124 124
Torrance 1 1 1 1 1 1 1 1
In [31]:
print('There are {} uniques categories.'.format(len(la_businesses_master['PRIMARY NAICS DESCRIPTION'].unique())))
There are 180 uniques categories.
In [ ]:
 

Identifying the top 10 popular businesses per neighborhood, to get an idea of the economic background of the neighborhood

In [32]:
# one hot encoding
la_onehot = pd.get_dummies(la_businesses_master[['PRIMARY NAICS DESCRIPTION']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
la_onehot['Neighborhood'] = la_businesses_master['NEIGHBORHOOD'] 

# move neighborhood column to the first column
fixed_columns = [la_onehot.columns[-1]] + list(la_onehot.columns[:-1])
la_onehot = la_onehot[fixed_columns]

la_onehot.head()
Out[32]:
Neighborhood Activities related to credit intermediation (including loan brokers) Advertising & related services Agents & managers for artists, athletes, entertainers, & other public figures All other miscellaneous store retailers (including tobacco, candle, & trophy shops) All other personal services All other professional, scientific, & technical services All other specialty trade contractors Apparel mfg. Apparel, piece goods, & notions Architectural services Automotive body, paint, interior, & glass repair Automotive equipment rental & leasing Automotive mechanical & electrical repair & maintenance Automotive parts, accessories, & tire stores Bakeries & tortilla mfg. Barber shops Beer, wine & liquor stores Beer, wine, & distilled alcoholic beverage Book stores Books, periodicals, & newspapers Business service centers (including private mail centers & copy shops) Carpentry Contractors (1997 NAICS) Carpet & upholstery cleaning services Chemical & allied products Child day care services Children's & infants' clothing stores Clothing accessories stores Coin-operated laundries & drycleaners Commercial & industrial machinery & equipment (except automotive & electronic) repair & maintenance Commercial & industrial machinery & equipment rental & leasing Community food & housing, & emergency & other relief services Computer & electronic product mfg. Computer & software stores Computer systems design & related services Consumer electronics & appliances rental Cosmetics, beauty supplies, & perfume stores Couriers & messengers Data processing, hosting, & related services Drinking places (alcoholic beverages) Drugs & druggists' sundries Drycleaning & laundry services (except coin-operated) (including laundry & drycleaning drop-off & pickup sites) Drywall, Plastering, Acoustical, and Insulation Contractors (1997 NAICS) Educational services (including schools, colleges, & universities) Electrical & electronic goods Electrical Contractors (1997 NAICS) Electronic shopping Employment services Engineering services Exterminating & pest control services Fabricated metal product mfg. Facilities support (management) services Family clothing stores Floor Laying and Other Floor Contractors (1997 NAICS) Florists Flower, nursery stock, & florists' supplies Footwear & leather goods repair Fruit & vegetable markets Full-service restaurants Furniture & home furnishing Furniture & related product mfg. Furniture stores Gasoline stations (including convenience stores with gas) General freight trucking, local General merchandise stores Gift, novelty, & souvenir stores Glass and Glazing Contractors (1997 NAICS) Grocery & related products Grocery stores (including supermarkets & convenience stores without gas) Hardware stores Hardware, & plumbing & heating equipment & supplies Home furnishings stores Hospitals Household appliance stores Independent artists, writers, & performers Individual & family services Insurance agencies & brokerages Internet publishing & broadcasting Internet service providers Investigation & security services Janitorial services Jewelry stores Jewelry, watch, precious stone, & precious metals Landscape architecture services Landscaping services Legal services Lessors of real estate (including mini warehouses & self-storage units) Limited-service eating places Lumber & other construction materials Machinery, equipment, & supplies Management, scientific, & technical consulting services Manufacturing and Industrial Building Construction (1997 NAICS) Medical & diagnostic laboratories Metal & mineral (except petroleum) Motion picture & video industries (except video rental) Motor vehicle & motor vehicle parts & supplies Multifamily Housing Construction (1997 NAICS) Musical instrument & supplies stores Nail salons New car dealers Nondepository credit intermediation (including sales financing & consumer lending) Nursing & residential care facilities Office administrative services Office supplies & stationery stores Offices of all other miscellaneous health practitioners Offices of certified public accountants Offices of chiropractors Offices of dentists Offices of mental health practitioners (except physicians) Offices of optometrists Offices of physical, occupational & speech therapists, & audiologists Offices of physicians (except mental health specialists) Offices of podiatrists Offices of real estate agents & brokers Optical goods stores Other Clothing Stores Other accounting services Other ambulatory health care services (including ambulance services, blood, & organ banks) Other amusement & recreation services (including golf courses, skiing facilities, marinas, fitness centers, bowling centers, skating rinks, miniature golf courses) Other automotive repair & maintenance (including oil change & lubrication shops & car washes) Other building equipment contractors Other building materials dealers Other business support services (including repossession services, court reporting, & stenotype services) Other consumer goods rental Other direct selling establishments (including door-to-door retailing, frozen food plan providers, party plan merchandisers, & coffee-break service providers) Other financial investment activities (including investment advice) Other food mfg. (including coffee, tea, flavoring, & seasonings) Other health & personal care stores Other insurance related activities Other miscellaneous durable goods Other miscellaneous nondurable goods Other personal & household goods repair & maintenance Other personal care services (including diet & weight reducing centers) Outpatient care centers Paint & wallpaper stores Paint, varnish, & supplies Painting and Wall Covering Contractors (1997 NAICS) Paper & paper products Paper mfg. Parking lots & garages Pet care (except veterinary) services Pharmacies & drug stores Photographic services Plumbing, Heating, and Air-Conditioning Contractors (1997 NAICS) Poured concrete foundation & structure contractors Professional & commercial equipment & supplies Promoters of performing arts, sports, & similar events Radio, television, & other electronics stores Real estate property managers Recyclable materials Reupholstery & furniture repair Roofing, Siding, and Sheet Metal Contractors (1997 NAICS) Rooming & boarding houses Scientific research & development services Securities brokers Shoe stores Single Family Housing Construction (1997 NAICS) Sound recording industries Special food services (including food service contractors & caterers) Specialized design services (including interior, industrial, graphic, & fashion design) Specialized freight trucking (including household moving vans) Sporting goods stores Structural Steel Erection Contractors (1997 NAICS) Support activities for animal production (including farriers) Support activities for transportation (including motor vehicle towing) Tax preparation services Textile product mills Tile & terrazzo contractors Toy & hobby goods & supplies Translation & interpretation services Travel arrangement & reservation services Traveler accommodation (including hotels, motels, & bed & breakfast inns) Unclassified establishments (unable to classify) Used car dealers Used merchandise stores Vending machine operators Veterinary services Video tape & disc rental Warehousing & storage (except leases of mini warehouses & self-storage units) Waste management & remediation services Women's clothing stores
0 Los Angeles (Arlington Heights, Country Club P... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 Los Angeles (Arlington Heights, Country Club P... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 Los Angeles (Arlington Heights, Country Club P... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 Los Angeles (Arlington Heights, Country Club P... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 Los Angeles (Arlington Heights, Country Club P... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
In [33]:
la_onehot.shape
Out[33]:
(2130, 181)
In [34]:
la_grouped = la_onehot.groupby('Neighborhood').mean().reset_index()
la_grouped.head()
Out[34]:
Neighborhood Activities related to credit intermediation (including loan brokers) Advertising & related services Agents & managers for artists, athletes, entertainers, & other public figures All other miscellaneous store retailers (including tobacco, candle, & trophy shops) All other personal services All other professional, scientific, & technical services All other specialty trade contractors Apparel mfg. Apparel, piece goods, & notions Architectural services Automotive body, paint, interior, & glass repair Automotive equipment rental & leasing Automotive mechanical & electrical repair & maintenance Automotive parts, accessories, & tire stores Bakeries & tortilla mfg. Barber shops Beer, wine & liquor stores Beer, wine, & distilled alcoholic beverage Book stores Books, periodicals, & newspapers Business service centers (including private mail centers & copy shops) Carpentry Contractors (1997 NAICS) Carpet & upholstery cleaning services Chemical & allied products Child day care services Children's & infants' clothing stores Clothing accessories stores Coin-operated laundries & drycleaners Commercial & industrial machinery & equipment (except automotive & electronic) repair & maintenance Commercial & industrial machinery & equipment rental & leasing Community food & housing, & emergency & other relief services Computer & electronic product mfg. Computer & software stores Computer systems design & related services Consumer electronics & appliances rental Cosmetics, beauty supplies, & perfume stores Couriers & messengers Data processing, hosting, & related services Drinking places (alcoholic beverages) Drugs & druggists' sundries Drycleaning & laundry services (except coin-operated) (including laundry & drycleaning drop-off & pickup sites) Drywall, Plastering, Acoustical, and Insulation Contractors (1997 NAICS) Educational services (including schools, colleges, & universities) Electrical & electronic goods Electrical Contractors (1997 NAICS) Electronic shopping Employment services Engineering services Exterminating & pest control services Fabricated metal product mfg. Facilities support (management) services Family clothing stores Floor Laying and Other Floor Contractors (1997 NAICS) Florists Flower, nursery stock, & florists' supplies Footwear & leather goods repair Fruit & vegetable markets Full-service restaurants Furniture & home furnishing Furniture & related product mfg. Furniture stores Gasoline stations (including convenience stores with gas) General freight trucking, local General merchandise stores Gift, novelty, & souvenir stores Glass and Glazing Contractors (1997 NAICS) Grocery & related products Grocery stores (including supermarkets & convenience stores without gas) Hardware stores Hardware, & plumbing & heating equipment & supplies Home furnishings stores Hospitals Household appliance stores Independent artists, writers, & performers Individual & family services Insurance agencies & brokerages Internet publishing & broadcasting Internet service providers Investigation & security services Janitorial services Jewelry stores Jewelry, watch, precious stone, & precious metals Landscape architecture services Landscaping services Legal services Lessors of real estate (including mini warehouses & self-storage units) Limited-service eating places Lumber & other construction materials Machinery, equipment, & supplies Management, scientific, & technical consulting services Manufacturing and Industrial Building Construction (1997 NAICS) Medical & diagnostic laboratories Metal & mineral (except petroleum) Motion picture & video industries (except video rental) Motor vehicle & motor vehicle parts & supplies Multifamily Housing Construction (1997 NAICS) Musical instrument & supplies stores Nail salons New car dealers Nondepository credit intermediation (including sales financing & consumer lending) Nursing & residential care facilities Office administrative services Office supplies & stationery stores Offices of all other miscellaneous health practitioners Offices of certified public accountants Offices of chiropractors Offices of dentists Offices of mental health practitioners (except physicians) Offices of optometrists Offices of physical, occupational & speech therapists, & audiologists Offices of physicians (except mental health specialists) Offices of podiatrists Offices of real estate agents & brokers Optical goods stores Other Clothing Stores Other accounting services Other ambulatory health care services (including ambulance services, blood, & organ banks) Other amusement & recreation services (including golf courses, skiing facilities, marinas, fitness centers, bowling centers, skating rinks, miniature golf courses) Other automotive repair & maintenance (including oil change & lubrication shops & car washes) Other building equipment contractors Other building materials dealers Other business support services (including repossession services, court reporting, & stenotype services) Other consumer goods rental Other direct selling establishments (including door-to-door retailing, frozen food plan providers, party plan merchandisers, & coffee-break service providers) Other financial investment activities (including investment advice) Other food mfg. (including coffee, tea, flavoring, & seasonings) Other health & personal care stores Other insurance related activities Other miscellaneous durable goods Other miscellaneous nondurable goods Other personal & household goods repair & maintenance Other personal care services (including diet & weight reducing centers) Outpatient care centers Paint & wallpaper stores Paint, varnish, & supplies Painting and Wall Covering Contractors (1997 NAICS) Paper & paper products Paper mfg. Parking lots & garages Pet care (except veterinary) services Pharmacies & drug stores Photographic services Plumbing, Heating, and Air-Conditioning Contractors (1997 NAICS) Poured concrete foundation & structure contractors Professional & commercial equipment & supplies Promoters of performing arts, sports, & similar events Radio, television, & other electronics stores Real estate property managers Recyclable materials Reupholstery & furniture repair Roofing, Siding, and Sheet Metal Contractors (1997 NAICS) Rooming & boarding houses Scientific research & development services Securities brokers Shoe stores Single Family Housing Construction (1997 NAICS) Sound recording industries Special food services (including food service contractors & caterers) Specialized design services (including interior, industrial, graphic, & fashion design) Specialized freight trucking (including household moving vans) Sporting goods stores Structural Steel Erection Contractors (1997 NAICS) Support activities for animal production (including farriers) Support activities for transportation (including motor vehicle towing) Tax preparation services Textile product mills Tile & terrazzo contractors Toy & hobby goods & supplies Translation & interpretation services Travel arrangement & reservation services Traveler accommodation (including hotels, motels, & bed & breakfast inns) Unclassified establishments (unable to classify) Used car dealers Used merchandise stores Vending machine operators Veterinary services Video tape & disc rental Warehousing & storage (except leases of mini warehouses & self-storage units) Waste management & remediation services Women's clothing stores
0 Athens, Los Angeles (South Los Angeles) 0.0 0.000000 0.0 0.035088 0.0 0.017544 0.0 0.000000 0.000000 0.000000 0.0 0.0 0.017544 0.0 0.000000 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.017544 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.035088 0.000000 0.017544 0.0 0.000000 0.000000 0.0 0.017544 0.0 0.000000 0.0 0.0 0.0 0.0 0.017544 0.035088 0.0 0.017544 0.000000 0.0 0.000000 0.052632 0.000000 0.0 0.0 0.052632 0.0 0.0 0.017544 0.0 0.000000 0.0 0.035088 0.000000 0.000000 0.0 0.000000 0.035088 0.0 0.0 0.0 0.000000 0.0 0.228070 0.017544 0.0 0.0 0.000000 0.017544 0.0 0.000000 0.0 0.017544 0.0 0.0 0.0 0.0 0.0 0.017544 0.000000 0.0 0.017544 0.0 0.000000 0.0 0.0 0.0 0.0 0.017544 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.035088 0.0 0.000000 0.017544 0.0 0.000000 0.017544 0.0 0.0 0.017544 0.0 0.000000 0.017544 0.0 0.0 0.000000 0.0 0.000000 0.000000 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.017544 0.0 0.0 0.0 0.035088 0.017544 0.000000 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.035088 0.017544 0.017544 0.000000 0.000000 0.0 0.0 0.000000 0.000000 0.000000 0.0
1 City Terrace, Los Angeles (Boyle Heights) 0.0 0.000000 0.0 0.142857 0.0 0.000000 0.0 0.000000 0.000000 0.000000 0.0 0.0 0.000000 0.0 0.071429 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.0 0.071429 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.000000 0.000000 0.000000 0.0 0.000000 0.000000 0.0 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.071429 0.0 0.000000 0.000000 0.0 0.000000 0.000000 0.000000 0.0 0.0 0.071429 0.0 0.0 0.000000 0.0 0.071429 0.0 0.000000 0.000000 0.000000 0.0 0.000000 0.071429 0.0 0.0 0.0 0.000000 0.0 0.071429 0.000000 0.0 0.0 0.000000 0.000000 0.0 0.000000 0.0 0.142857 0.0 0.0 0.0 0.0 0.0 0.000000 0.000000 0.0 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0 0.000000 0.000000 0.0 0.071429 0.000000 0.0 0.0 0.000000 0.0 0.071429 0.000000 0.0 0.0 0.000000 0.0 0.000000 0.071429 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.000000 0.000000 0.000000 0.0
2 Commerce, City of 0.0 0.000000 0.0 0.000000 0.0 0.000000 0.0 0.000000 0.000000 0.000000 0.0 0.0 0.000000 0.0 0.000000 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.000000 0.000000 0.000000 0.0 1.000000 0.000000 0.0 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.000000 0.0 0.000000 0.000000 0.0 0.000000 0.000000 0.000000 0.0 0.0 0.000000 0.0 0.0 0.000000 0.0 0.000000 0.0 0.000000 0.000000 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.0 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.000000 0.0 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.000000 0.0 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0 0.000000 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.0 0.000000 0.000000 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.000000 0.000000 0.000000 0.0
3 Commerce, East Los Angeles 0.0 0.000000 0.0 0.034483 0.0 0.000000 0.0 0.068966 0.034483 0.000000 0.0 0.0 0.000000 0.0 0.034483 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.034483 0.0 0.034483 0.000000 0.068966 0.0 0.000000 0.000000 0.0 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.034483 0.0 0.000000 0.034483 0.0 0.034483 0.000000 0.034483 0.0 0.0 0.000000 0.0 0.0 0.034483 0.0 0.000000 0.0 0.000000 0.000000 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.0 0.000000 0.0 0.103448 0.000000 0.0 0.0 0.000000 0.000000 0.0 0.034483 0.0 0.034483 0.0 0.0 0.0 0.0 0.0 0.000000 0.000000 0.0 0.068966 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.034483 0.0 0.000000 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.034483 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.0 0.034483 0.000000 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.000000 0.000000 0.034483 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.000000 0.000000 0.000000 0.034483 0.000000 0.0 0.0 0.034483 0.034483 0.034483 0.0
4 Culver City, Los Angeles (Mar Vista) 0.0 0.030303 0.0 0.015152 0.0 0.015152 0.0 0.000000 0.000000 0.030303 0.0 0.0 0.000000 0.0 0.000000 0.0 0.0 0.015152 0.0 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.060606 0.015152 0.000000 0.0 0.015152 0.045455 0.0 0.000000 0.0 0.015152 0.0 0.0 0.0 0.0 0.000000 0.000000 0.0 0.000000 0.000000 0.0 0.000000 0.015152 0.000000 0.0 0.0 0.000000 0.0 0.0 0.000000 0.0 0.000000 0.0 0.000000 0.030303 0.015152 0.0 0.015152 0.000000 0.0 0.0 0.0 0.015152 0.0 0.121212 0.000000 0.0 0.0 0.030303 0.000000 0.0 0.000000 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.000000 0.015152 0.0 0.000000 0.0 0.015152 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.060606 0.0 0.0 0.0 0.0 0.0 0.015152 0.0 0.015152 0.0 0.030303 0.000000 0.0 0.000000 0.030303 0.0 0.0 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.030303 0.0 0.000000 0.030303 0.045455 0.0 0.015152 0.0 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.030303 0.000000 0.000000 0.015152 0.0 0.0 0.0 0.0 0.0 0.015152 0.0 0.0 0.0 0.0 0.121212 0.000000 0.000000 0.000000 0.015152 0.0 0.0 0.000000 0.000000 0.000000 0.0
In [35]:
la_grouped.shape
Out[35]:
(55, 181)
In [36]:
num_top_venues = 5

for hood in la_grouped['Neighborhood']:
    #print("----"+hood+"----")
    temp = la_grouped[la_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    #print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    #print('\n')
In [37]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
In [38]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = la_grouped['Neighborhood']

for ind in np.arange(la_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(la_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()
Out[38]:
Neighborhood 1st Most Common Venue 2nd Most Common Venue 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue 9th Most Common Venue 10th Most Common Venue
0 Athens, Los Angeles (South Los Angeles) Lessors of real estate (including mini warehou... Grocery stores (including supermarkets & conve... General merchandise stores Full-service restaurants Individual & family services Single Family Housing Construction (1997 NAICS) Travel arrangement & reservation services Janitorial services Educational services (including schools, colle... All other miscellaneous store retailers (inclu...
1 City Terrace, Los Angeles (Boyle Heights) Motor vehicle & motor vehicle parts & supplies All other miscellaneous store retailers (inclu... Household appliance stores Bakeries & tortilla mfg. Full-service restaurants Grocery stores (including supermarkets & conve... Janitorial services Photographic services Child day care services Paint, varnish, & supplies
2 Commerce, City of Employment services Women's clothing stores Fruit & vegetable markets Gift, novelty, & souvenir stores General merchandise stores General freight trucking, local Gasoline stations (including convenience store... Furniture stores Furniture & related product mfg. Furniture & home furnishing
3 Commerce, East Los Angeles Lessors of real estate (including mini warehou... Offices of all other miscellaneous health prac... Electrical Contractors (1997 NAICS) Apparel mfg. Other direct selling establishments (including... Metal & mineral (except petroleum) Furniture stores Gift, novelty, & souvenir stores Pharmacies & drug stores Full-service restaurants
4 Culver City, Los Angeles (Mar Vista) Lessors of real estate (including mini warehou... Travel arrangement & reservation services Other accounting services Educational services (including schools, colle... Plumbing, Heating, and Air-Conditioning Contra... Engineering services Single Family Housing Construction (1997 NAICS) Photographic services Parking lots & garages Insurance agencies & brokerages
In [ ]:
 

Visuzalizing the active businesses on the map of LA, with businesses close to competitors removed

In [39]:
address = 'Los Angeles, CA'

geolocator = Nominatim(user_agent="ca_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Los Angeles are {}, {}.'.format(latitude, longitude))



map_la_businesses = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, name in zip(la_businesses_master['LATITUDE'], la_businesses_master['LONGITUDE'], la_businesses_master['BUSINESS NAME']):
    label = '{}'.format(name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=2,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.5,
        parse_html=False).add_to(map_la_businesses)  
    
map_la_businesses
The geograpical coordinate of Los Angeles are 34.0536909, -118.2427666.
Out[39]:
Make this Notebook Trusted to load map: File -> Trust Notebook

In this map, blue markers indicate active businesses, and red markers indicate the competing Cuban restaurants. Understandably, the Cuban restaurants are in the center of the 'no-business' circle.

In [40]:
address = 'Los Angeles, CA'

geolocator = Nominatim(user_agent="ca_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Los Angeles are {}, {}.'.format(latitude, longitude))


#cuban_businesses = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, name in zip(cuban_competitors['lat'], cuban_competitors['lng'], cuban_competitors['name']):
    label = '{}'.format(name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=2,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.5,
        parse_html=False).add_to(map_la_businesses)  
    
map_la_businesses
The geograpical coordinate of Los Angeles are 34.0536909, -118.2427666.
Out[40]:
Make this Notebook Trusted to load map: File -> Trust Notebook
In [41]:
la_businesses_master.head()
Out[41]:
BUSINESS NAME STREET ADDRESS CITY ZIP_CODE PRIMARY NAICS DESCRIPTION LATITUDE LONGITUDE NEIGHBORHOOD close_to_competitor
0 PALACE OF VENICE GUEST HOME /C 1727 CRENSHAW BLVD LOS ANGELES 90019 Rooming & boarding houses 34.0425 -118.3295 Los Angeles (Arlington Heights, Country Club P... False
2 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C 4473 W PICO BLVD LOS ANGELES 90019 Educational services (including schools, colle... 34.0483 -118.3329 Los Angeles (Arlington Heights, Country Club P... False
3 MATTHEW K MARCY 1278 QUEEN ANNE PLACE LOS ANGELES 90019 Lessors of real estate (including mini warehou... 34.0484 -118.3326 Los Angeles (Arlington Heights, Country Club P... False
4 SPECIAL SERVICE FOR GROUPS 1310 S ST ANDREWS PLACE LOS ANGELES 90019 Individual & family services 34.0470 -118.3116 Los Angeles (Arlington Heights, Country Club P... False
5 NAG WOO SUNG 1619 4TH AVENUE LOS ANGELES 90019 Plumbing, Heating, and Air-Conditioning Contra... 34.0431 -118.3212 Los Angeles (Arlington Heights, Country Club P... False

Location based Clustering, to identify possible neighborhood clusters to operate the truck from

In [50]:
# set number of clusters
kclusters = 4

la_grouped_clustering = la_businesses_master[['LATITUDE', 'LONGITUDE']]

# run k-means clustering
kmeans = KMeans(init="k-means++", n_clusters=kclusters, n_init=22).fit(la_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 
Out[50]:
array([3, 3, 3, 3, 3, 3, 3, 3, 3, 3], dtype=int32)
In [53]:
la_grouped_clustering.head()
Out[53]:
LATITUDE LONGITUDE
0 34.0425 -118.3295
2 34.0483 -118.3329
3 34.0484 -118.3326
4 34.0470 -118.3116
5 34.0431 -118.3212
In [54]:
# add clustering labels
#la_businesses_master.insert(0, 'Cluster Labels', kmeans.labels_)

#la_merged = la_grouped_clustering
#la_merged.rename(columns={'NEIGHBORHOOD':'Neighborhood'}, inplace=True)

#la_merged.insert(0, 'Cluster Labels', kmeans.labels_)

# merge toronto_grouped with toronto_data to add latitude/longitude for each neighborhood
#la_merged = la_merged.join(la_businesses_master.set_index('Neighborhood'), on='Neighborhood')

#la_merged.head() # check the last columns!
la_businesses_master.head()
Out[54]:
Cluster Labels BUSINESS NAME STREET ADDRESS CITY ZIP_CODE PRIMARY NAICS DESCRIPTION LATITUDE LONGITUDE NEIGHBORHOOD close_to_competitor
0 0 PALACE OF VENICE GUEST HOME /C 1727 CRENSHAW BLVD LOS ANGELES 90019 Rooming & boarding houses 34.0425 -118.3295 Los Angeles (Arlington Heights, Country Club P... False
2 0 A A OFICINA CENTRAL HISPANA DE LOS ANGELES /C 4473 W PICO BLVD LOS ANGELES 90019 Educational services (including schools, colle... 34.0483 -118.3329 Los Angeles (Arlington Heights, Country Club P... False
3 0 MATTHEW K MARCY 1278 QUEEN ANNE PLACE LOS ANGELES 90019 Lessors of real estate (including mini warehou... 34.0484 -118.3326 Los Angeles (Arlington Heights, Country Club P... False
4 0 SPECIAL SERVICE FOR GROUPS 1310 S ST ANDREWS PLACE LOS ANGELES 90019 Individual & family services 34.0470 -118.3116 Los Angeles (Arlington Heights, Country Club P... False
5 0 NAG WOO SUNG 1619 4TH AVENUE LOS ANGELES 90019 Plumbing, Heating, and Air-Conditioning Contra... 34.0431 -118.3212 Los Angeles (Arlington Heights, Country Club P... False
In [55]:
la_merged.head()
Out[55]:
Cluster Labels LATITUDE LONGITUDE
0 0 34.0425 -118.3295
2 0 34.0483 -118.3329
3 0 34.0484 -118.3326
4 0 34.0470 -118.3116
5 0 34.0431 -118.3212

Map with all the location clusters

In [56]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, cluster in zip(la_merged['LATITUDE'], la_merged['LONGITUDE'], la_merged['Cluster Labels']):
    #label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    label = folium.Popup(' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
Out[56]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Examining Clusters

Count of businesses in each cluster can be seen below. With one of these clusters having way lesser number of businesses, it is clear that those neighborhoods can be of the least priority to sell Cuban food

In [59]:
(la_businesses_master.loc[la_businesses_master['Cluster Labels'] == 0, la_businesses_master.columns[[1] + list(range(5, la_businesses_master.shape[1]))]]).shape
Out[59]:
(888, 6)
In [60]:
(la_businesses_master.loc[la_businesses_master['Cluster Labels'] == 1, la_businesses_master.columns[[1] + list(range(5, la_businesses_master.shape[1]))]]).shape
Out[60]:
(25, 6)
In [61]:
(la_businesses_master.loc[la_businesses_master['Cluster Labels'] == 2, la_businesses_master.columns[[1] + list(range(5, la_businesses_master.shape[1]))]]).shape
Out[61]:
(839, 6)
In [62]:
(la_businesses_master.loc[la_businesses_master['Cluster Labels'] == 3, la_businesses_master.columns[[1] + list(range(5, la_businesses_master.shape[1]))]]).shape
Out[62]:
(378, 6)

The neighborhoods below are arranged in descending order, displaying neighborhoods with most businesses first, in each cluster. Thus, the ideal locations for selling Cuban food has now been determined.

In [68]:
(la_businesses_master.loc[la_businesses_master['Cluster Labels'] == 0, la_businesses_master.columns[[1] + list(range(5, la_businesses_master.shape[1]))]]).groupby('NEIGHBORHOOD').count().sort_values('BUSINESS NAME',ascending=False)
Out[68]:
BUSINESS NAME PRIMARY NAICS DESCRIPTION LATITUDE LONGITUDE close_to_competitor
NEIGHBORHOOD
Los Angeles (South Los Angeles) 146 146 146 146 146
Los Angeles (Jefferson Park, Leimert Park) 69 69 69 69 69
Los Angeles (Hancock Park, Wilshire Center, Windsor Square) 65 65 65 65 65
Los Angeles (South Los Angeles, Southeast Los Angeles) 64 64 64 64 64
Los Angeles (Baldwin Hills, Crenshaw, Leimert Park) 63 63 63 63 63
Los Angeles (Hyde Park, View Park, Windsor Hills) 61 61 61 61 61
Athens, Los Angeles (South Los Angeles) 57 57 57 57 57
Los Angeles (Arlington Heights, Country Club Park, Mid-City) 53 53 53 53 53
Los Angeles (Fairfax, Melrose, Miracle Mile, Park La Brea, Wilshire-La Brea) 40 40 40 40 40
Los Angeles (West Adams) 35 35 35 35 35
Los Angeles (Southeast Los Angeles) 30 30 30 30 30
Los Angeles (Griffith Park, Hollywood, Los Feliz) 27 27 27 27 27
Los Angeles (Southeast Los Angeles, Watts), Willowbrook 26 26 26 26 26
Los Angeles (Hancock Park, Western Wilton, Wilshire Center, Windsor Square) 26 26 26 26 26
Los Angeles (Hancock Park, Koreatown, Wilshire Center, Wilshire Park, Windsor Square) 23 23 23 23 23
Los Angeles (Byzantine-Latino Quarter, Harvard Heights, Koreatown, Pico Heights) 23 23 23 23 23
Los Angeles (Southeast Los Angeles, Watts) 17 17 17 17 17
Los Angeles (South Los Angeles), Florence-Graham 15 15 15 15 15
Los Angeles (East Hollywood) 13 13 13 13 13
Los Angeles (Hollywood) 11 11 11 11 11
Los Angeles (Hancock Park, Rampart Village, Virgil Village, Wilshire Center, Windsor Square) 6 6 6 6 6
Los Angeles (Southeast Los Angeles), Vernon 5 5 5 5 5
Ladera Heights 4 4 4 4 4
Los Angeles (Los Angeles International Airport, Westchester) 3 3 3 3 3
Los Angeles (Southeast Los Angeles, Univerity Park) 3 3 3 3 3
Los Angeles (Hollywood), West Hollywood 2 2 2 2 2
Los Angeles (Echo Park, Silver Lake) 1 1 1 1 1
In [69]:
(la_businesses_master.loc[la_businesses_master['Cluster Labels'] == 1, la_businesses_master.columns[[1] + list(range(5, la_businesses_master.shape[1]))]]).groupby('NEIGHBORHOOD').count().sort_values('BUSINESS NAME',ascending=False)
Out[69]:
BUSINESS NAME PRIMARY NAICS DESCRIPTION LATITUDE LONGITUDE close_to_competitor
NEIGHBORHOOD
Los Angeles (Downtown Central, Downtown Fashion District) 4 4 4 4 4
Los Angeles (Baldwin Hills, Crenshaw, Leimert Park) 2 2 2 2 2
Los Angeles (South Los Angeles), Florence-Graham 2 2 2 2 2
Los Angeles (Palms) 2 2 2 2 2
Los Angeles (Mid-City West), West Hollywood 2 2 2 2 2
Los Angeles (Hollywood) 1 1 1 1 1
Los Angeles (West Fairfax) 1 1 1 1 1
Los Angeles (Lincoln Heights, Montecito Heights) 1 1 1 1 1
Los Angeles (Hyde Park, View Park, Windsor Hills) 1 1 1 1 1
Los Angeles (Hancock Park, Koreatown, Wilshire Center, Wilshire Park, Windsor Square) 1 1 1 1 1
Los Angeles (Bel Air Estates, Brentwood) 1 1 1 1 1
Los Angeles (Dowtown Fashion District, South Park-South) 1 1 1 1 1
Los Angeles (Downtown Historic Core, Arts District) 1 1 1 1 1
Los Angeles (Downtown Civic Center, Chinatown, Arts District, Bunker Hill, Historic Core, Little Tokyo) 1 1 1 1 1
Los Angeles (Downtown Bunker Hill, City West, Historic Core, South Park-North) 1 1 1 1 1
Los Angeles (Cypress Park, Glassell Park, Mt Washington) 1 1 1 1 1
Los Angeles (Boyle Heights) 1 1 1 1 1
Torrance 1 1 1 1 1
In [70]:
(la_businesses_master.loc[la_businesses_master['Cluster Labels'] == 2, la_businesses_master.columns[[1] + list(range(5, la_businesses_master.shape[1]))]]).groupby('NEIGHBORHOOD').count().sort_values('BUSINESS NAME',ascending=False)
Out[70]:
BUSINESS NAME PRIMARY NAICS DESCRIPTION LATITUDE LONGITUDE close_to_competitor
NEIGHBORHOOD
Los Angeles (Sawtelle, West Los Angeles) 234 234 234 234 234
Los Angeles (Westwood) 123 123 123 123 123
Los Angeles (Cheviot Hills, Rancho Park) 113 113 113 113 113
Los Angeles (Bel Air Estates, Brentwood) 105 105 105 105 105
Los Angeles (Century City) 79 79 79 79 79
Culver City, Los Angeles (Mar Vista) 66 66 66 66 66
Los Angeles (Hollywood), West Hollywood 34 34 34 34 34
Los Angeles (Hollywood, Melrose), West Hollywood 30 30 30 30 30
Los Angeles (Los Angeles International Airport, Westchester) 17 17 17 17 17
Los Angeles (Bel Air Estates, Beverly Glen) 15 15 15 15 15
Los Angeles (Fairfax, Melrose, Miracle Mile, Park La Brea, Wilshire-La Brea) 10 10 10 10 10
Los Angeles (Hollywood) 9 9 9 9 9
Los Angeles (Mid-City West), West Hollywood 3 3 3 3 3
Los Angeles (University of California Los Angeles) 1 1 1 1 1
In [71]:
(la_businesses_master.loc[la_businesses_master['Cluster Labels'] == 3, la_businesses_master.columns[[1] + list(range(5, la_businesses_master.shape[1]))]]).groupby('NEIGHBORHOOD').count().sort_values('BUSINESS NAME',ascending=False)
Out[71]:
BUSINESS NAME PRIMARY NAICS DESCRIPTION LATITUDE LONGITUDE close_to_competitor
NEIGHBORHOOD
Los Angeles (Highland Park) 71 71 71 71 71
Los Angeles (El Sereno, Monterey Hills, University Hills) 47 47 47 47 47
Los Angeles (Eagle Rock) 47 47 47 47 47
Los Angeles (Lincoln Heights, Montecito Heights) 43 43 43 43 43
Los Angeles (Boyle Heights) 36 36 36 36 36
Los Angeles (Cypress Park, Glassell Park, Mt Washington) 33 33 33 33 33
Commerce, East Los Angeles 29 29 29 29 29
Los Angeles (Downtown Civic Center, Chinatown, Arts District, Bunker Hill, Historic Core, Little Tokyo) 22 22 22 22 22
City Terrace, Los Angeles (Boyle Heights) 14 14 14 14 14
East Los Angeles 12 12 12 12 12
Los Angeles (Downtown Central, Downtown Fashion District) 2 2 2 2 2
Los Angeles (Fairfax, Melrose, Miracle Mile, Park La Brea, Wilshire-La Brea) 2 2 2 2 2
Los Angeles (Griffith Park, Hollywood, Los Feliz) 2 2 2 2 2
Los Angeles (Southeast Los Angeles), Vernon 2 2 2 2 2
Los Angeles (Mid-City West), West Hollywood 2 2 2 2 2
Los Angeles (Echo Park, Silver Lake) 2 2 2 2 2
Los Angeles (Jefferson Park, Leimert Park) 1 1 1 1 1
Los Angeles (West Fairfax) 1 1 1 1 1
Los Angeles (Southeast Los Angeles, Univerity Park) 1 1 1 1 1
Los Angeles (Palms) 1 1 1 1 1
Los Angeles (Hancock Park, Koreatown, Wilshire Center, Wilshire Park, Windsor Square) 1 1 1 1 1
Los Angeles (Hollywood) 1 1 1 1 1
Los Angeles (Hancock Park, Wilshire Center, Windsor Square) 1 1 1 1 1
Los Angeles (Hancock Park, Western Wilton, Wilshire Center, Windsor Square) 1 1 1 1 1
Los Angeles (Hancock Park, Rampart Village, Virgil Village, Wilshire Center, Windsor Square) 1 1 1 1 1
Commerce, City of 1 1 1 1 1
Los Angeles (Cheviot Hills, Rancho Park) 1 1 1 1 1
Los Angeles (Westwood) 1 1 1 1 1
In [ ]: